BUG: Regression in chained getitem indexing with embedded list-like from 0.12 (GH6394) #6396

Merged
merged 1 commit into from
Feb 18, 2014


jreback
Contributor

@jreback jreback commented Feb 18, 2014

closes #6394

@jreback jreback added this to the 0.14.0 milestone Feb 18, 2014
jreback added a commit that referenced this pull request Feb 18, 2014
BUG: Regression in chained getitem indexing with embedded list-like from 0.12 (GH6394)
@jreback jreback merged commit 5eda239 into pandas-dev:master Feb 18, 2014
@michaelaye
Contributor

This bug has been active in 0.13.1 since February (that is, since 0.13.1 was released). Don't you think breaking something as simple as df.UVIS[0] warrants a bugfix release?

python test_read.py
/Users/maye/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/scipy/io/idl.py:180: UserWarning: warning: empty strings are now set to '' instead of None
  warnings.warn("warning: empty strings are now set to '' instead of None")
0.13.1
Traceback (most recent call last):
  File "test_read.py", line 7, in <module>
    print df.UVIS[0]
  File "/Users/maye/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas-0.13.1-py2.7-macosx-10.6-x86_64.egg/pandas/core/series.py", line 493, in __getitem__
    return self._constructor(result,index=[key]*len(result)).__finalize__(self)
  File "/Users/maye/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas-0.13.1-py2.7-macosx-10.6-x86_64.egg/pandas/core/series.py", line 220, in __init__
    raise_cast_failure=True)
  File "/Users/maye/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/pandas-0.13.1-py2.7-macosx-10.6-x86_64.egg/pandas/core/series.py", line 2600, in _sanitize_array
    raise Exception('Data must be 1-dimensional')
Exception: Data must be 1-dimensional

@michaelaye
Contributor

Ok, to be fair, this might be just my highly unusual data. The IDLSAV files are the weirdest creatures. Having 2D numpy arrays per dataframe cell now.
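
To make the "2D numpy arrays per dataframe cell" situation concrete, here is a minimal sketch with made-up data (the array shapes and values are assumptions; only the column name UVIS comes from the traceback above). On a pandas version that includes this fix, both the chained lookup and the explicit positional lookup should hand back the stored array.

```python
import numpy as np
import pandas as pd

# Build a 1-D object array whose elements are 2-D arrays (hypothetical data).
cells = np.empty(2, dtype=object)
cells[0] = np.zeros((2, 3))
cells[1] = np.ones((2, 3))

df = pd.DataFrame({"UVIS": cells})

# Chained getitem, the same pattern as df.UVIS[0] in the traceback.
chained = df["UVIS"][0]

# Explicit positional access to the same cell.
positional = df["UVIS"].iloc[0]
```

Using .iloc makes the intent (first row by position) explicit and does not depend on the index labels.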

@jreback
Contributor Author

jreback commented May 13, 2014

@michaelaye

this is a real special case that is not recommended in any event. To be honest I think we should ban this type of thing, but it's too late for that.

0.14.0 should be coming shortly anyhow

@michaelaye
Contributor

Should we work on a better IDLSAV importer?

@jreback
Contributor Author

jreback commented May 13, 2014

In [1]: import urllib2

In [2]: url = 'http://python4astronomers.github.com/_downloads/myidlfile.sav'

In [3]: open('myidlfile.sav', 'wb').write(urllib2.urlopen(url).read())

In [4]: from scipy.io.idl import readsav

In [5]: data = readsav('myidlfile.sav')
/usr/local/lib/python2.7/site-packages/scipy/io/idl.py:167: UserWarning: warning: empty strings are now set to '' instead of None
  warnings.warn("warning: empty strings are now set to '' instead of None")

In [6]: data
Out[6]: 
{'str': rec.array([ (12.520000457763672, -27.219999313354492, array([  8.69999981,   8.60000038,   9.60000038,  10.10000038,  11.5       ], dtype=float32))], 
       dtype=[(('ra', 'RA'), '>f4'), (('dec', 'DEC'), '>f4'), (('fluxes', 'FLUXES'), 'O')]),
 'x': array([ 0.        ,  0.33333334,  0.66666669,  1.        ,  1.33333337,
         1.66666663,  2.        ,  2.33333325,  2.66666675,  3.        ,
         3.33333325,  3.66666675,  4.        ,  4.33333349,  4.66666651,
         5.        ,  5.33333349,  5.66666651,  6.        ,  6.33333349,
         6.66666651,  7.        ,  7.33333349,  7.66666651,  8.        ], dtype=float32),
 'y': array([ 0.        ,  0.32745016,  0.62514514,  0.88137364,  1.09861231,
         1.2837956 ,  1.44363546,  1.58348906,  1.70741141,  1.8184464 ,
         1.91889644,  2.01052713,  2.0947125 ,  2.17253971,  2.24487901,
         2.31243849,  2.37579918,  2.43544436,  2.4917798 ,  2.54514909,
         2.59584522,  2.64412069,  2.69019413,  2.73425555,  2.77647233], dtype=float32)}

what do you do with it after getting the weird scipy object?

@michaelaye
Contributor

import pandas as pd
from scipy.io import readsav

fname = 'FUV2005_195_19_52_08_UVIS_011EN_ICYEXO001_PRIME.sav'
tmp = readsav(fname)
df = pd.DataFrame.from_records(tmp['datastruct2'])

Explanation: the dictionary returned by readsav has 3 keys, but only the first key contains all the data,
so check which keys are in the output object of readsav.
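
A small sketch of that key check, assuming the readsav result behaves like a dict of numpy arrays. The helper name describe_entries and the stand-in dictionary are hypothetical; they replace the real .sav file so the snippet runs without it.

```python
import numpy as np
import pandas as pd


def describe_entries(mapping):
    """Return {key: (shape, field names or None)} for each array in a dict-like."""
    summary = {}
    for key, value in mapping.items():
        arr = np.asarray(value)
        # dtype.names is a tuple for structured/record arrays, None otherwise.
        summary[key] = (arr.shape, arr.dtype.names)
    return summary


# Hypothetical stand-in for the object returned by readsav.
fake_sav = {
    "datastruct2": np.zeros(3, dtype=[("a", "f4"), ("b", "f4")]),
    "x": np.arange(4, dtype="f4"),
}

summary = describe_entries(fake_sav)

# Only the entry with named fields maps naturally onto DataFrame columns.
df = pd.DataFrame.from_records(fake_sav["datastruct2"])
```

Entries whose field names are not None are the ones worth passing to DataFrame.from_records.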

@jreback
Contributor Author

jreback commented May 13, 2014

that's just an odd structure (but maybe you WANT to keep the data like that).

@michaelaye
Contributor

that's true, I don't know how they created those files. But I could swear I saw exactly the same symptom before: 1 key useful, 2 keys just looking like the end markers of a stone-age C or FORTRAN structure. BTW, wouldn't it be cool if these GitHub issue discussion boxes could be IPython notebook cells? ;)

@jreback
Contributor Author

jreback commented May 13, 2014

It prob would pay to make these IDL files into a useful Python object (not just read them in), and to allow manipulation of several of them (maybe this already exists somewhere)...

this is what I do when I have multiple 'things' that are related: separate the bits that are computation/vectorizable into Frames/Panels, etc., and wrap them in objects that are nice
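
One way that separation could look, as a rough sketch only: scalar fields of a record array go into a DataFrame, while array-valued (object) fields stay as plain numpy arrays behind a small wrapper. The class name IdlRecordSet, its methods, and the sample record (loosely modelled on the 'str' entry shown earlier in the thread) are all hypothetical and not part of pandas or scipy.

```python
import numpy as np
import pandas as pd


class IdlRecordSet:
    """Hypothetical wrapper: scalar fields in a DataFrame, array fields kept aside."""

    def __init__(self, records):
        scalars, arrays = {}, {}
        for name in records.dtype.names:
            column = records[name]
            if column.dtype == object:
                # Object fields hold one array per record; keep them as a list.
                arrays[name] = [np.asarray(cell) for cell in column]
            else:
                # Copy so the frame does not share memory with the record array.
                scalars[name] = column.copy()
        self.frame = pd.DataFrame(scalars)
        self.arrays = arrays

    def array(self, name, row):
        """Return the array stored in field `name` for record number `row`."""
        return self.arrays[name][row]


# Made-up record array with two scalar fields and one array-valued field.
records = np.empty(1, dtype=[("ra", "f4"), ("dec", "f4"), ("fluxes", "O")])
records["ra"] = 12.52
records["dec"] = -27.22
records["fluxes"][0] = np.array([8.7, 8.6, 9.6, 10.1, 11.5], dtype=np.float32)

wrapped = IdlRecordSet(records)
fluxes = wrapped.array("fluxes", 0)
```

The vectorizable columns are then available through wrapped.frame, and the per-record arrays never have to live inside DataFrame cells.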

@michaelaye
Contributor

You mean to write a wrapper class that provides access to the data? Yes, for my case I'm planning to do that. I was just wondering whether my case is maybe a standard case of an old IDL saving format, and whether newly created structures are all more sane, like the one you found there. I have to understand how this was created. Ah, there might be a version string hidden inside somewhere..

@jreback
Contributor Author

jreback commented May 13, 2014

yep that's what I mean

Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves
Development

Successfully merging this pull request may close these issues.

Indexing Regression in 0.13.0
2 participants